Automatic Induction of Finite State Transducers for Simple Phonological Rules
Abstract
This paper presents a method for learning phonological rules from sample pairs of underlying and surface forms, without negative evidence. The learned rules are represented as finite state transducers that accept underlying forms as input and generate surface forms as output. The learning algorithm is an extension of OSTIA, an algorithm for learning general subsequential finite state transducers. Although OSTIA can learn arbitrary subsequential finite state transducers in the limit, large dictionaries of actual English pronunciations did not provide enough samples for it to induce phonological rules correctly. We therefore augmented OSTIA with two kinds of knowledge specific to natural language phonology, both biases from “universal grammar”. The first is that underlying phones are often realized as phonetically similar or identical surface phones. The second is that phonological rules tend to apply across natural phonological classes. These additions yielded more compact, accurate, and general transducers than the unmodified OSTIA algorithm. An implementation of the algorithm successfully learns a number of English postlexical rules.
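To make the target representation concrete, here is a minimal sketch of a subsequential finite state transducer that maps an underlying form to a surface form. It is hand-written, not learned, and it is not the paper's transducer or the OSTIA algorithm. The rule it encodes (a toy version of English flapping, where "t" surfaces as "dx" between a stressed and an unstressed vowel), the ARPAbet-like symbols, and the names `symbol_class`, `TRANSITIONS`, `FINAL_OUTPUT`, and `transduce` are all assumptions made for illustration.

```python
# Toy subsequential transducer: deterministic on the input, with an output
# string on every transition and a final output string on every state.
# Assumed notation: a symbol ending in "1" is a stressed vowel, a symbol
# ending in "0" is an unstressed vowel, and "dx" is the flap.

def symbol_class(sym):
    """Map an input phone to the coarse class the transitions are keyed on."""
    if sym.endswith("1"):
        return "V1"
    if sym.endswith("0"):
        return "V0"
    if sym == "t":
        return "t"
    return "other"

# (state, class) -> (next state, output template).
# "%" in a template stands for the current input symbol.
# State 0: no relevant context. State 1: just read a stressed vowel.
# State 2: read stressed vowel + "t"; the "t" is withheld until the
# following symbol decides whether it surfaces as "t" or "dx".
TRANSITIONS = {
    (0, "V1"): (1, ["%"]),
    (0, "V0"): (0, ["%"]),
    (0, "t"): (0, ["%"]),
    (0, "other"): (0, ["%"]),
    (1, "V1"): (1, ["%"]),
    (1, "V0"): (0, ["%"]),
    (1, "t"): (2, []),
    (1, "other"): (0, ["%"]),
    (2, "V1"): (1, ["t", "%"]),
    (2, "V0"): (0, ["dx", "%"]),
    (2, "t"): (0, ["t", "%"]),
    (2, "other"): (0, ["t", "%"]),
}

# Emitted when the input ends in a given state (flushes a withheld "t").
FINAL_OUTPUT = {0: [], 1: [], 2: ["t"]}

def transduce(underlying):
    """Apply the toy flapping transducer to a list of underlying phones."""
    state, surface = 0, []
    for sym in underlying:
        state, template = TRANSITIONS[(state, symbol_class(sym))]
        surface.extend(sym if out == "%" else out for out in template)
    surface.extend(FINAL_OUTPUT[state])
    return surface

print(transduce("b ae1 t er0".split()))  # ['b', 'ae1', 'dx', 'er0']
print(transduce("k ae1 t".split()))      # ['k', 'ae1', 't']
```

The delayed output in state 2 is what makes the machine subsequential rather than a plain letter-to-letter mapping: the output for "t" depends on the next input symbol, so it is withheld and then emitted either on the following transition or as the state's final output.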
Similar resources
Learning Bias and Phonological-Rule Induction
A fundamental debate in the machine learning of language has been the role of prior knowledge in the learning process. Purely nativist approaches, such as the Principles and Parameters model, build parameterized linguistic generalizations directly into the learning system. Purely empirical approaches use a general, domain-independent learning rule (Error Back-Propagation, Instance-Based Generali...
French large vocabulary recognition with cross-word phonology transducers
Although finite-state transducers have been widely used in linguistics, their application to speech recognition has begun only recently [1]. We describe our implementation of French large vocabulary recognition based on transducers, and how we take advantage of this approach to integrate automatic pronunciation rules and cross-word phenomena such as French '...
An efficient implementation of phonological rules using finite-state transducers
Context-dependent phonological rules are used to model the mapping from phonemes to their varied phonetic surface realizations. Others, most notably Kaplan and Kay, have described how to compile general context-dependent phonological rewrite rules into finite-state transducers. Such rules are very powerful, but their compilation is complex and can result in very large nondeterministic automata....
The Use of Finite-state Transducers for Modeling Phonological and Morphological Constraints in Automatic Speech Recognition
It has been shown in [3] that modeling cross-word phonological constraints in the recognition network may decrease the sentence error rate by 12% in a connected digit recognition task. Modeling of these constraints is supposed to be important in large vocabulary recognition (LVCSR) as well. Up to now, however, it was not possible to evaluate this technique in LVCSR due to the lack of a method t...
Initial Results on Wrapping Semistructured Web Pages with Finite-State Transducers and Contextual Rules
This paper presents SoftMealy, a novel Web wrapper representation formalism. This representation is based on a finite-state transducer (FST) and contextual rules, which allow a wrapper to wrap semistructured Web pages containing missing attributes, multiple attribute values, variant attribute permutations, exceptions and typos, the features that no previous work can handle. A SoftMealy wrapper ...
Publication date: 1995